{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Derivatives fundamentals\n",
    "\n",
    "This notebook will introduce you to the fundamentals of computing the derivative of the solution map to optimization problems. The derivative can be used for **sensitvity analysis**, to see how a solution would change given small changes to the parameters, and to compute **gradients** of scalar-valued functions of the solution.\n",
    "\n",
    "In this notebook, we will consider a simple disciplined geometric program. The geometric program under consideration is\n",
    "\n",
    "$$\n",
    "\\begin{equation}\n",
    "\\begin{array}{ll}\n",
    "\\mbox{minimize} & 1/(xyz) \\\\\n",
    "\\mbox{subject to} & a(xy + xz + yz) \\leq b\\\\\n",
    "& x \\geq y^c,\n",
    "\\end{array}\n",
    "\\end{equation}\n",
    "$$\n",
    "\n",
    "where $x \\in \\mathbf{R}_{++}$, $y \\in \\mathbf{R}_{++}$, and $z \\in \\mathbf{R}_{++}$ are the variables, and $a \\in \\mathbf{R}_{++}$, $b \\in \\mathbf{R}_{++}$ and $c \\in \\mathbf{R}$ are the parameters. The vector\n",
    "$$\n",
    "\\alpha = \\begin{bmatrix} a \\\\ b \\\\ c \\end{bmatrix}\n",
    "$$\n",
    "is the vector of parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import cvxpy as cp\n",
    "\n",
    "\n",
    "x = cp.Variable(pos=True)\n",
    "y = cp.Variable(pos=True)\n",
    "z = cp.Variable(pos=True)\n",
    "\n",
    "a = cp.Parameter(pos=True)\n",
    "b = cp.Parameter(pos=True)\n",
    "c = cp.Parameter()\n",
    "\n",
    "objective_fn = 1/(x*y*z)\n",
    "objective = cp.Minimize(objective_fn)\n",
    "constraints = [a*(x*y + x*z + y*z) <= b, x >= y**c]\n",
    "problem = cp.Problem(objective, constraints)\n",
    "\n",
    "problem.is_dgp(dpp=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice the keyword argument `dpp=True`. The parameters must enter in the DGP problem acording to special rules, which we refer to as `dpp`. The DPP rules are described in an [online tutorial](https://www.cvxpy.org/tutorial/advanced/index.html#disciplined-parametrized-programming).\n",
    "\n",
    "Next, we solve the problem, setting the parameters $a$, $b$ and $c$ to $2$, $1$, and $0.5$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5612147353889386\n",
      "0.31496200373359456\n",
      "0.36892055859991446\n"
     ]
    }
   ],
   "source": [
    "a.value = 2.0\n",
    "b.value = 1.0\n",
    "c.value = 0.5\n",
    "problem.solve(gp=True, requires_grad=True)\n",
    "\n",
    "print(x.value)\n",
    "print(y.value)\n",
    "print(z.value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice the keyword argument ``requires_grad=True``; this is necessary to subsequently compute derivatives."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Solution map\n",
    "\n",
    "\n",
    "The **solution map** of the above problem is a function\n",
    "$$\\mathcal{S} : \\mathbf{R}^2_{++} \\times \\mathbf{R} \\to \\mathbf{R}^3_{++}$$\n",
    "which maps the parameter vector to the vector of optimal solutions\n",
    "$$\n",
    "\\mathcal S(\\alpha) = \\begin{bmatrix} x(\\alpha) \\\\ y(\\alpha) \\\\ z(\\alpha)\\end{bmatrix}.\n",
    "$$\n",
    "\n",
    "Here, $x(\\alpha)$, $y(\\alpha)$, and $z(\\alpha)$ are the optimal values of the variables corresponding to the parameter vector.\n",
    "\n",
    "As an example, we just saw that\n",
    "$$\n",
    "\\mathcal S((2.0, 1.0, 0.5)) = \\begin{bmatrix} 0.5612 \\\\ 0.3150 \\\\ 0.3690 \\end{bmatrix}.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sensitivity analysis\n",
    "\n",
    "When the solution map is differentiable, we can use its derivative\n",
    "$$\n",
    "\\mathsf{D}\\mathcal{S}(\\alpha) \\in \\mathbf{R}^{3 \\times 3}\n",
    "$$\n",
    "to perform a **sensitivity analysis**, which studies how the solution would change given small changes to the parameters.\n",
    "\n",
    "Suppose we perturb the parameters by a vector of small magnitude $\\mathsf{d}\\alpha \\in \\mathbf{R}^3$. We can approximate the change $\\Delta$ in the solution due to the perturbation using the derivative, as\n",
    "$$\n",
    "\\Delta = \\mathcal{S}(\\alpha + \\mathsf{d}\\alpha) - \\mathcal{S}(\\alpha) \\approx \\mathsf{D}\\mathcal{S}(\\alpha) \\mathsf{d}\\alpha.\n",
    "$$\n",
    "\n",
    "We can compute this in CVXPY, as follows.\n",
    "\n",
    "Partition the perturbation as\n",
    "$$\n",
    "\\mathsf{d}\\alpha = \\begin{bmatrix} \\mathsf{d}a \\\\ \\mathsf{d}b \\\\ \\mathsf{d}c\\end{bmatrix}.\n",
    "$$\n",
    "\n",
    "We set the ``delta`` attributes of the parameters to their perturbations, and then call the ``derivative`` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "da, db, dc = 1e-2, 1e-2, 1e-2\n",
    "\n",
    "a.delta = da\n",
    "b.delta = db\n",
    "c.delta = dc\n",
    "problem.derivative()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ``derivative`` method populates the ``delta`` attributes of the variables as a side-effect, with the predicted change in the variable. We can compare the predictions to the actual solution of the perturbed problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x: predicted 0.55729 actual 0.55734\n",
      "y: predicted 0.31783 actual 0.31783\n",
      "z: predicted 0.37179 actual 0.37175\n"
     ]
    }
   ],
   "source": [
    "x_hat = x.value + x.delta\n",
    "y_hat = y.value + y.delta\n",
    "z_hat = z.value + z.delta\n",
    "\n",
    "a.value += da\n",
    "b.value += db\n",
    "c.value += dc\n",
    "\n",
    "problem.solve(gp=True)\n",
    "print('x: predicted {0:.5f} actual {1:.5f}'.format(x_hat, x.value))\n",
    "print('y: predicted {0:.5f} actual {1:.5f}'.format(y_hat, y.value))\n",
    "print('z: predicted {0:.5f} actual {1:.5f}'.format(z_hat, z.value))\n",
    "\n",
    "a.value -= da\n",
    "b.value -= db\n",
    "c.value -= dc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, the predictions and the actual solutions are fairly close."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gradient\n",
    "\n",
    "We can compute gradient of a scalar-valued function of the solution with respect to the parameters. Let $f : \\mathbf{R}^{3} \\to \\mathbf{R}$, and suppose we want to compute the gradient of the composition $f \\circ \\mathcal S$. By the chain rule,\n",
    "$$\n",
    "\\nabla f(S(\\alpha)) = \\mathsf{D}^T\\mathcal{S}(\\alpha) \\begin{bmatrix}\\mathsf{d}x \\\\ \\mathsf{d}y \\\\ \\mathsf{d}z\\end{bmatrix},\n",
    "$$\n",
    "where $\\mathsf{D}^T\\mathcal{S}$ is the adjoint (or transpose) of the derivative operator, and $\\mathsf{d}x$, $\\mathsf{d}y$, and $\\mathsf{d}z$ are the partial derivatives of $f$ with respect to its arguments.\n",
    "\n",
    "We can compute the gradient in CVXPY, using the ``backward`` method. As an example, suppose\n",
    "$$\n",
    "f(x, y, z) = \\frac{1}{2}(x^2 + y^2 + z^2),\n",
    "$$\n",
    "so that $\\mathsf{d}x = x$, $\\mathsf{d}y = y$, and $\\mathsf{d}z = z$. Let $\\mathsf{d}\\alpha = \\nabla f(S(\\alpha))$,\n",
    "and suppose we subtract $\\eta \\mathsf{d}\\alpha$ from the parameter, where $\\eta$ is a positive constant. Using the following code, we can compare $f(\\mathcal S(\\alpha - \\eta \\mathsf{d}\\alpha))$ with the value predicted by the gradient,\n",
    "$$\n",
    "f(\\mathcal S(\\alpha - \\eta \\mathsf{d}\\alpha)) \\approx f(\\mathcal S(\\alpha)) - \\eta \\mathsf{d}\\alpha^T\\mathsf{d}\\alpha.\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "original 0.27513 predicted 0.22709 actual 0.22942\n"
     ]
    }
   ],
   "source": [
    "problem.solve(gp=True, requires_grad=True)\n",
    "\n",
    "def f(x, y, z):\n",
    "    return 1/2*(x**2 + y**2 + z**2)\n",
    "\n",
    "original = f(x, y, z).value\n",
    "\n",
    "x.gradient = x.value\n",
    "y.gradient = y.value\n",
    "z.gradient = z.value\n",
    "problem.backward()\n",
    "\n",
    "eta = 0.5\n",
    "dalpha = cp.vstack([a.gradient, b.gradient, c.gradient])\n",
    "predicted = float((original - eta*dalpha.T @ dalpha).value)\n",
    "\n",
    "a.value -= eta*a.gradient\n",
    "b.value -= eta*b.gradient\n",
    "c.value -= eta*c.gradient\n",
    "problem.solve(gp=True)\n",
    "actual = f(x, y, z).value\n",
    "\n",
    "print('original {0:.5f} predicted {1:.5f} actual {2:.5f}'.format(\n",
    "       original, predicted, actual))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
